Fuzzy communities and the concept of bridgeness in complex networks
We consider the problem of fuzzy community detection in networks, which complements and
expands the concept of overlapping community structure. Our approach allows each vertex of 
the graph to belong to multiple communities at the same time, determined by exact numerical 
membership degrees, even in the presence of uncertainty in the data being analyzed. We created an 
algorithm for determining the optimal membership degrees with respect to a given goal function. 
Based on the membership degrees, we introduce a new measure that is able to identify outlier 
vertices that do not belong to any of the communities, bridge vertices that belong significantly to 
more than one single community, and regular vertices that fundamentally restrict their interactions 
within their own community, while also being able to quantify the centrality of a vertex with respect 
to its dominant community. The method can also be used for prediction in case of uncertainty in 
the dataset analyzed. The number of communities can be given in advance, or determined by the 
algorithm itself using a fuzzified variant of the modularity function. The technique is able to discover
the fuzzy community structure of different real-world networks including, but not limited to, social
networks, scientific collaboration networks and cortical networks with high confidence.



I. INTRODUCTION 



Recent studies revealed that graph models of many
real-world phenomena exhibit an overlapping community
structure, which is hard to grasp with classical graph
clustering methods where every vertex of the graph
belongs to exactly one community. This is especially
true for social networks, where it is not uncommon that
individuals in the network belong to more than one
community at the same time. Individuals who connect groups
in the network function as "bridges", hence a "bridge"
is defined as a vertex that crosses structural
holes between discrete groups of people [1]. It is
therefore important to define a quantity that measures the
commitment of a node to several communities in order
to obtain a more realistic view of these networks.

The intuitive meaning of a bridge vertex may differ in
different kinds of networks that exist beyond sociometrics.
In protein interaction networks, bridges can be proteins
with multiple roles. In cortical networks containing
brain areas responsible for different modalities (for
instance, visual and tactile input processing), the bridges
are presumably the areas that take part in the integration
and higher level processing of sensory signals. In word
association networks, words with multiple meanings are
likely to be bridges. The state-of-the-art overlapping
community detection algorithms [1] are not
able to quantify the notion of bridgeness, while other
attempts at quantifying it (e.g., the participation index)
are only concerned with non-overlapping communities.

To emphasize the importance of bridge vertices in
community detection and to illustrate the concept, we take
the simple graph shown in Fig. 1(a) as an example. A
visual inspection of this graph most likely suggests two
densely connected communities, with vertex 5 standing
somewhere in between, belonging to both of them at the
same time. One may argue that vertex 5 itself forms a
separate community, but a community with only a single
node is usually not meaningful (and we can also easily
add more edges connecting the two communities to vertex
5 to emphasize its sharedness). This property of vertex
5 is not revealed by any classical community detection
algorithm that does not account for overlaps or outliers.

Figure 1: Panel (a): a simple graph that cannot be
partitioned into two communities without allowing overlaps or
outliers. Panel (b): the dendrogram of the graph as calculated
by the greedy modularity optimization algorithm [7]. The
dashed line denotes the level where the dendrogram should
be cut in order to reach the maximal modularity (denoted by
$Q$).

Hierarchical algorithms build a dendrogram from the
vertices by joining them to communities one by one (or
starting from the opposite direction, splitting the graph
into two subcommunities, then splitting the subcommunities
again until every vertex forms a single community).
For instance, the modularity optimization algorithm
of Clauset et al. [7] repeatedly merges individual
vertices or already created communities to form bigger
ones in a way that greedily maximizes the modularity
of the achieved partition (for the definition of modularity,
see [1] or Eq. (13)). Using this algorithm, vertex 5 was
merged with vertex 4 right at the first step of the algorithm,
misleadingly suggesting that they cannot be separated
from each other. The complete dendrogram is shown in
Fig. 1(b).



A better solution can be achieved by applying the
clique percolation method (CPM) of Palla et al., which
is also able to discover overlapping communities. In this
case, vertex 5 was classified as an outlier (a vertex that
does not belong to any community). This result stands
closer to our visual inspection and clearly underlines the
fact that in many cases, we should not assume that a
vertex belongs to one and only one community in the
graph. However, vertex 5 is not a true outlier, since
removing it from the network would result in two
disconnected components. Vertex 5 is an integral part of
the network, serving as the only connection between two
densely connected subgroups.



II. METHODS 

A. Fuzzy community detection as a constrained 
optimization problem 

The objective of classical community detection in
networks is to partition the vertex set of the graph $G(V, E)$
into $c$ distinct subsets in a way that puts densely
connected groups of vertices in the same community. $c$ can
either be given in advance or determined by the community
detection algorithm itself. For the time being, let
us assume that $c$ is known. In this case, a convenient
representation of a given partition is the partition matrix
$\mathbf{U} = [u_{ik}]$ [9]. $\mathbf{U}$ has $N = |V|$ columns and $c$ rows, and
$u_{ik} = 1$ if and only if vertex $k$ belongs to the $i$th subset
in the partition, otherwise it is zero. From the definition
of the partition, it clearly follows that $\sum_{i=1}^{c} u_{ik} = 1$ for
all $1 \le k \le N$. The size of community $i$ can then be
calculated as $\sum_{k=1}^{N} u_{ik}$, and for any meaningful partition,
we can assume that $0 < \sum_{k=1}^{N} u_{ik} < N$. These partitions
are traditionally called hard or crisp partitions, because
a vertex can belong to one and only one of the detected
communities.

The generalization of the hard partition follows by
allowing $u_{ik}$ to attain any real value from the interval $[0, 1]$.
The constraints imposed on the partition matrix remain
the same [10]:

\begin{align}
u_{ik} \in [0, 1] \quad & \text{for all } 1 \le i \le c,\ 1 \le k \le N \tag{1a} \\
\sum_{i=1}^{c} u_{ik} = 1 \quad & \text{for all } 1 \le k \le N \tag{1b} \\
0 < \sum_{k=1}^{N} u_{ik} < N \quad & \text{for all } 1 \le i \le c \tag{1c}
\end{align}



Eq. (1b) simply states that the total membership degree of
each vertex must be equal to 1. Informally, every vertex
has one unit of membership, which is distributed among
the communities. Eq. (1c) is
the formal description of a simple requirement: we are
not interested in empty communities (to which no vertex
belongs to any extent), and we do not want all vertices
to be grouped into a single community. Partitions of this
type are called fuzzy partitions. The fuzzy membership
degrees for a given vertex can be thought of as a
trait vector that describes some (possibly nonobservable)
properties of the entity which the vertex represents in a
compact manner. Trait-based graph models have already
been suggested as models for complex networks [11].
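The three constraints of Eq. (1) are easy to check mechanically. The following is a minimal sketch, assuming the partition matrix is stored as a list of $c$ rows with $N$ entries each; the function name `is_fuzzy_partition` and the tolerance `tol` are illustrative choices, not part of the method described here.

```python
def is_fuzzy_partition(u, tol=1e-9):
    """Check constraints (1a)-(1c) for a c x N membership matrix.

    u[i][k] is the membership degree of vertex k in community i.
    """
    c = len(u)
    n = len(u[0])
    # (1a): every membership degree lies in [0, 1]
    if any(not (-tol <= x <= 1.0 + tol) for row in u for x in row):
        return False
    # (1b): the membership degrees of each vertex sum to 1
    for k in range(n):
        if abs(sum(u[i][k] for i in range(c)) - 1.0) > tol:
            return False
    # (1c): no community is empty and none absorbs every vertex completely
    return all(tol < sum(row) < n - tol for row in u)


# Hypothetical example: 2 communities, 3 vertices, the middle vertex is shared.
u = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
print(is_fuzzy_partition(u))
```

A hard partition passes the same check, since it is the special case where every entry is 0 or 1.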

Since the groundbreaking work of Dunn [12] and
Bezdek [9] on the fuzzy c-means clustering algorithm,
many methods have been developed to search for fuzzy
clusters in multi-dimensional datasets. For an overview
of these methods, see Bezdek and Pal. However,
these methods usually require a distance function defined
in the space the data belong to, therefore it is impossible
to apply them to graph partitioning directly, except in
cases where the vertices of the graph are embedded in
an n-dimensional space. A recent paper of Zhang et al.
discusses a possible embedding of the vertices of an
arbitrary graph into an n-dimensional space using spectral
mapping in order to utilize the fuzzy c-means algorithm
on graphs. They were able to identify meaningful
fuzzy communities in several well-known test graphs
(e.g., the Zachary karate club network and the network
of American college football teams), but the
eigenvector calculations involved in the algorithm render
it computationally expensive to use on large networks.

To overcome the need for spatial embedding, we propose
a different approach based on vertex similarities. We
observe that a meaningful partition (be it hard or
fuzzy) should group vertices that are somehow similar to
each other in the same community. It is reasonable to
assume that an edge between vertices $v_1$ and $v_2$ implies
the similarity of $v_1$ and $v_2$, and likewise, the absence of
an edge implies dissimilarity. Let us assume that we have
a function $s(\mathbf{U}, i, j)$ that satisfies the following criteria:

1. $s(\mathbf{U}, i, j) \in [0, 1]$.

2. $s(\mathbf{U}, i, j)$ is continuous and differentiable for all $u_{ij}$.

3. $s(\mathbf{U}, i, j) = 1$ if the membership values of $v_i$ and $v_j$
suggest that they are as similar as possible.

4. $s(\mathbf{U}, i, j) = 0$ if the membership values of $v_i$ and $v_j$
suggest that they are completely dissimilar (there
is no chance that they belong to the same community).

Let us call such $s(\mathbf{U}, i, j)$ a similarity function, and
for the sake of simplicity, we denote it by $s_{ij}$ from
now on (not emphasizing its dependence on $\mathbf{U}$). Suppose
we have a prior assumption about the actual similarity
of the vertices, denoted by $\tilde{s}_{ij}$ for $v_i$ and $v_j$. This leads
us to the following equation, which measures the fitness
of a given partition $\mathbf{U}$ of graph $G(V, E)$ by quantifying
how precisely it approximates the prescribed similarity
values with $s$:

\begin{equation}
D_G(\mathbf{U}) = \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} \left( \tilde{s}_{ij} - s_{ij} \right)^2 \tag{2}
\end{equation}

where the $w_{ij}$'s are optional weights and $N = |V|$ is the
number of vertices in the graph. For the sake of notational
simplicity, we also introduce the matrices $\mathbf{W} = [w_{ij}]$,
$\mathbf{S}(\mathbf{U}) = [s_{ij}]$ and $\tilde{\mathbf{S}} = [\tilde{s}_{ij}]$. From now on, we assume
that $\tilde{\mathbf{S}} = \mathbf{A}_G$, the adjacency matrix of the graph, in
concordance with our assumption that the similarity of
connected vertex pairs should be close to 1 and the similarity
of disconnected vertex pairs should be close to zero. The
only thing left is to precisely define a similarity function
$s_{ij}$ that satisfies the conditions prescribed above. The
definition we used was the following:

\begin{equation}
s_{ij} = \sum_{k=1}^{c} u_{ki} u_{kj} \tag{3}
\end{equation}

It easily follows that $\mathbf{S}(\mathbf{U}) = [s_{ij}] = \mathbf{U}^T \mathbf{U}$.
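As an illustration of the goal function, the sketch below evaluates the dot-product similarity $s_{ij} = \sum_k u_{ki} u_{kj}$ and the weighted squared error between the prescribed and calculated similarities. The function names, the list-of-lists storage and the three-vertex path graph are assumptions made for this example only.

```python
def similarity(u):
    """Return S(U) = U^T U, i.e. s_ij = sum_k u_ki * u_kj."""
    c, n = len(u), len(u[0])
    return [[sum(u[k][i] * u[k][j] for k in range(c)) for j in range(n)]
            for i in range(n)]


def goal(u, target, w=None):
    """Weighted squared error between the target and S(U).

    w=None stands for the default choice of all weights equal to 1.
    """
    n = len(target)
    s = similarity(u)
    return sum((1.0 if w is None else w[i][j]) * (target[i][j] - s[i][j]) ** 2
               for i in range(n) for j in range(n))


# Hypothetical path graph 0 - 1 - 2; the target is its adjacency matrix.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
u = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
print(similarity(u))
print(goal(u, adj))
```

Note that with the plain adjacency matrix as the target, the diagonal entries $s_{ii}$ are compared against zero; whether the diagonal should be 0 or 1 is a parametrization choice.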

In summary, the community detection problem in this
framework boils down to the optimization of $D_G(\mathbf{U})$
defined in Eq. (2): we must find the $\mathbf{U}$ that minimizes $D_G(\mathbf{U})$
while satisfying the conditions of Eq. (1). The number of
clusters $c$, the weight matrix $\mathbf{W}$ and the desired
similarities $\tilde{\mathbf{S}}$ are given in advance (the latter most
commonly equals the adjacency matrix $\mathbf{A}_G$). This is a
nonlinear constrained optimization problem. Although
there exists a set of necessary conditions that restrict the
set of possible $\mathbf{U}$'s worth evaluating, the
computationally most feasible approach to optimize $D_G(\mathbf{U})$
is to use a gradient-based iterative optimization method
(e.g., simulated annealing). The equality constraints in
Eq. (1b) can be incorporated into the goal function by
Lagrangian multipliers $\boldsymbol{\lambda} = [\lambda_1, \lambda_2, \ldots, \lambda_N]$, resulting in the
following modified goal function:

\begin{equation}
D_G(\mathbf{U}, \boldsymbol{\lambda}) = \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij} \left( \tilde{s}_{ij} - s_{ij} \right)^2 + \sum_{i=1}^{N} \lambda_i \left( \sum_{k=1}^{c} u_{ki} - 1 \right) \tag{4}
\end{equation}

The modified goal function compactly encodes the
original goal function and the constraints, since
$\nabla_{u_{ij}} D_G(\mathbf{U}, \boldsymbol{\lambda}) = 0$ (for all $1 \le i \le c$ and $1 \le j \le N$)
ensures that we are at a stationary point of the goal
function, and $\nabla_{\boldsymbol{\lambda}} D_G(\mathbf{U}, \boldsymbol{\lambda}) = 0$ ensures that we satisfy
the conditions of Eq. (1b). Therefore, stationary points of
Eq. (4) will also be stationary points of Eq. (2) and they do
not violate Eq. (1b).

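To employ a gradient-based iterative optimization
method, we need the derivatives of the goal function with
respect to $u_{kl}$. First we note that

\begin{equation}
\frac{\partial s_{ij}}{\partial u_{kl}} = \frac{\partial}{\partial u_{kl}} \left( u_{ki} u_{kj} \right) \tag{5}
\end{equation}

which is zero, except when $i = l$ or $j = l$:

\begin{equation}
\frac{\partial s_{ij}}{\partial u_{kl}} =
\begin{cases}
2 u_{kl} & \text{if } i = l \text{ and } j = l \\
u_{kj} & \text{if } i = l \text{ and } j \ne l \\
u_{ki} & \text{if } i \ne l \text{ and } j = l \\
0 & \text{otherwise}
\end{cases} \tag{6}
\end{equation}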



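The partial derivative of $D_G(\mathbf{U}, \boldsymbol{\lambda})$ with respect to $u_{kl}$ is
therefore

\begin{equation}
\frac{\partial D_G}{\partial u_{kl}} = -2 \sum_{i=1}^{N} w_{il} \left( \tilde{s}_{il} - s_{il} \right) u_{ki} - 2 \sum_{j=1}^{N} w_{lj} \left( \tilde{s}_{lj} - s_{lj} \right) u_{kj} + \lambda_l \tag{7}
\end{equation}

Let $e_{ij} = w_{ij} (\tilde{s}_{ij} - s_{ij})$. Summing the partial derivatives
for $k = 1, 2, \ldots, c$, making them equal to zero and
substituting Eq. (1b) back where appropriate leaves us
with:

\begin{equation}
\lambda_l = \frac{2}{c} \sum_{i=1}^{N} \left( e_{il} + e_{li} \right) \tag{8}
\end{equation}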



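The substitution of Eq. (8) into Eq. (7) yields one
component of the goal function's gradient vector:

\begin{equation}
\frac{\partial D_G}{\partial u_{kl}} = 2 \sum_{i=1}^{N} \left( e_{il} + e_{li} \right) \left( \frac{1}{c} - u_{ki} \right) \tag{9}
\end{equation}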
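The gradient component can be evaluated directly from the membership matrix. The sketch below assumes the form $2 \sum_i (e_{il} + e_{li})(1/c - u_{ki})$ with $e_{ij} = w_{ij}(\tilde{s}_{ij} - s_{ij})$ and the dot-product similarity; the function name and the small path graph are illustrative assumptions.

```python
def gradient(u, target, w=None):
    """Evaluate the gradient component for every entry u_kl of a c x N matrix."""
    c, n = len(u), len(u[0])
    s = [[sum(u[k][i] * u[k][j] for k in range(c)) for j in range(n)]
         for i in range(n)]
    # e_ij = w_ij * (target_ij - s_ij); w=None means all weights are 1
    e = [[(1.0 if w is None else w[i][j]) * (target[i][j] - s[i][j])
          for j in range(n)] for i in range(n)]
    return [[2.0 * sum((e[i][l] + e[l][i]) * (1.0 / c - u[k][i])
                       for i in range(n))
             for l in range(n)] for k in range(c)]


# Hypothetical path graph 0 - 1 - 2 with vertex 1 shared between two communities.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
u = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
g = gradient(u, adj)
print(g)
# Column sums of the gradient: each should vanish when the columns of u sum to 1.
print([sum(g[k][l] for k in range(2)) for l in range(3)])
```

Two properties are worth noting: the gradient vanishes identically when every membership equals $1/c$, and each column of the gradient sums to zero whenever the columns of the membership matrix sum to 1, so a step along it leaves the sum constraint intact.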



The simplest gradient-based algorithm for finding a
local minimum of $D_G$ is then the following:

1. Start from an arbitrary random partition $\mathbf{U}^{(0)}$ and
let $t = 0$.

2. Calculate the gradient vector of $D_G$ according to
Eq. (9) and the current $\mathbf{U}^{(t)}$.

3. If $\max_{k,l} \left| \partial D_G / \partial u_{kl} \right| < \varepsilon$, stop the iteration and declare
$\mathbf{U}^{(t)}$ a solution.

4. Otherwise, calculate the next partition in the
iteration with the following equation:

\begin{equation}
u_{kl}^{(t+1)} = u_{kl}^{(t)} - \alpha^{(t)} \frac{\partial D_G}{\partial u_{kl}} \tag{10}
\end{equation}

where $\alpha^{(t)}$ is a small step size constant chosen
appropriately.

5. Increase $t$ and continue from step 2.
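The five steps above can be sketched in a few lines. This is a minimal illustration rather than the implementation used for the measurements: it assumes all weights equal to 1, a descent step (the gradient is subtracted), and a simple accept/reject rule that halves the step size whenever a trial step leaves the interval $[0, 1]$ or increases the goal function. The function names, default parameter values and the test graph are all hypothetical.

```python
import random


def goal_and_gradient(u, target):
    """Return the goal function (all w_ij = 1) and its gradient matrix."""
    c, n = len(u), len(u[0])
    # e_ij = target_ij - s_ij, where s_ij is the dot product of columns i and j
    e = [[target[i][j] - sum(u[k][i] * u[k][j] for k in range(c))
          for j in range(n)] for i in range(n)]
    d = sum(x * x for row in e for x in row)
    g = [[2.0 * sum((e[i][l] + e[l][i]) * (1.0 / c - u[k][i]) for i in range(n))
          for l in range(n)] for k in range(c)]
    return d, g


def fuzzy_communities(target, c, alpha=0.01, eps=1e-4, max_steps=5000, seed=0):
    rng = random.Random(seed)
    n = len(target)
    # Step 1: random start, one normalised gamma(1, 1) vector per vertex
    draws = [[rng.gammavariate(1.0, 1.0) for _ in range(c)] for _ in range(n)]
    u = [[draws[l][k] / sum(draws[l]) for l in range(n)] for k in range(c)]
    d, g = goal_and_gradient(u, target)                      # step 2
    for _ in range(max_steps):
        if max(abs(x) for row in g for x in row) < eps or alpha < 1e-12:
            break                                            # step 3
        # Step 4: move against the gradient
        trial = [[u[k][l] - alpha * g[k][l] for l in range(n)] for k in range(c)]
        trial_d, trial_g = goal_and_gradient(trial, target)
        inside = all(0.0 <= x <= 1.0 for row in trial for x in row)
        if inside and trial_d <= d:
            u, d, g = trial, trial_d, trial_g
            alpha *= 1.1   # accepted: try a slightly larger step next time
        else:
            alpha *= 0.5   # rejected: retry from the same point with a smaller step
    return u, d


# Hypothetical test graph: triangles (0,1,2) and (4,5,6) joined through vertex 3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
n = 7
target = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
for a, b in edges:
    target[a][b] = target[b][a] = 1.0
u, d = fuzzy_communities(target, c=2)
for l in range(n):
    print(l, [round(u[k][l], 3) for k in range(2)])
```

The target used here sets the desired similarity of a vertex with itself to 1, which is one possible parametrization. The result is a local minimum and depends on the random start, so repeated runs with different seeds are advisable.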

$\alpha^{(t)}$ can be determined by a line search along the
direction defined by the gradient vector, it can be adjusted
iteratively according to some simulated annealing
schedule, or it can be
made adaptive from iteration to iteration by checking the
difference of the values of the goal function in the last few
steps: the step size can be increased if the value of the
goal function decreased, and it must be decreased if the
value of the goal function increased. We must also make
sure that the procedure does not end up in a saddle point
or a local maximum of $D_G(\mathbf{U})$ [37].

According to our simulations, the quality of the
result is not affected by the initial membership degrees,
but the speed of convergence is. In the extreme case, if
we choose all $u_{ij}$ to be equal to $1/c$, all the gradients
will be zero (see Eq. (9)), therefore it is suggested to use
a randomized initial partition matrix. The best results
can be achieved by choosing the initial membership
degrees from a uniform distribution while still satisfying
the sum constraints. Uniformity with respect to the
constraints is not straightforward to achieve. The intuitive
approach is to choose a random number from the interval
$[0, 1]$ for every $u_{ij}$ and divide them by their respective
column sums to satisfy Eq. (1b). However, this method
is biased towards membership vectors describing vertices
equally participating in every community. The proper
way to sample from all possible membership vectors is
to draw every vector from a Dirichlet distribution with
order $c$ and $\boldsymbol{\alpha} = [1, 1, \ldots, 1]$ where $\boldsymbol{\alpha}$ has $c$ coordinates.
Such a distribution can be generated by drawing $c$
independent random samples from gamma distributions each
with shape and scale parameters equal to 1, and dividing
each variable by the sum of all of them [19].

Figure 2: Running time of our algorithm as a function of
the number of vertices in a graph with 4 communities. The
hardware used for calculation was a 1.83 GHz Intel Core Duo
MacBook. Fitting $f(x) = a x^b$ resulted in an exponent of
$b \approx 1.968$ (standard deviation from the fitted curve: 2.583),
confirming our reasoning on the quadratic running time of
the algorithm.
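The gamma-based construction of the initial partition matrix is short enough to show in full. The sketch below uses `random.gammavariate` from the Python standard library; the function name `random_partition` and the list-of-rows layout are assumptions made for this example.

```python
import random


def random_partition(n, c, rng=random):
    """Draw a c x N partition matrix whose columns are Dirichlet(1, ..., 1)."""
    u = [[0.0] * n for _ in range(c)]
    for l in range(n):
        # c independent gamma draws with shape 1 and scale 1, then normalise
        g = [rng.gammavariate(1.0, 1.0) for _ in range(c)]
        total = sum(g)
        for k in range(c):
            u[k][l] = g[k] / total
    return u


u = random_partition(n=5, c=3, rng=random.Random(42))
for row in u:
    print([round(x, 3) for x in row])
```

Every column sums to 1 by construction, so the matrix satisfies the sum constraint from the first iteration on.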

With $N$ vertices and $c$ communities, the time
complexity of calculating the initial membership is $O(Nc)$,
calculating the gradient vectors in each step is $O(N^2 c)$,
choosing the maximum gradient component for each
vertex is $O(Nc)$ and calculating the next partition matrix
is $O(Nc)$, assuming that the step size can be chosen in
$O(1)$ (which is true for simulated annealing strategies
or adaptive step sizes based on the decline of the goal
function between subsequent steps). This results in an
overall time complexity of $O(N^2 c h)$, where $h$ is the
number of steps necessary for the algorithm to terminate,
meaning that the calculation time is expected to scale
quadratically with the number of vertices if $N \gg c$, which
is confirmed by our measurements. The time complexity
of our implementation (Fig. 2) is slightly worse than
that of spectral methods, where an almost linear time
complexity can be achieved by, e.g., using the implicitly
restarted Arnoldi method [20] to compute some of the
eigenvectors belonging to the largest eigenvalues.

For the sake of completeness, we show that $\mathbf{U}^{(t+1)}$
remains a partition matrix if $\mathbf{U}^{(t)}$ was a partition matrix.
We recall that a partition matrix satisfies Eq. (1a) and
Eq. (1b). In the first step, we choose a $\mathbf{U}^{(0)}$ that satisfies
Eq. (1c). The persistence of Eq. (1a) and Eq. (1c) is
straightforward if we always keep $\alpha^{(t)}$ low enough, so we only
have to prove the persistence of Eq. (1b):

\begin{equation}
\sum_{k=1}^{c} u_{kl}^{(t+1)} = \sum_{k=1}^{c} u_{kl}^{(t)} - \alpha^{(t)} \sum_{k=1}^{c} \frac{\partial D_G}{\partial u_{kl}} = 1 - 2 \alpha^{(t)} \sum_{i=1}^{N} \left( e_{il} + e_{li} \right) \left( 1 - \sum_{k=1}^{c} u_{ki}^{(t)} \right) = 1 \tag{11}
\end{equation}



B. The concept of bridgeness 



One of the advantages of fuzzy community detection is
that it enables us to analyze to what extent a given vertex
is shared among different communities. This measure is
called bridgeness. Intuitively, a vertex that belongs to
only one of the communities has zero bridgeness, while
a vertex that belongs to all of the communities exactly
to the same extent has a bridgeness of 1. We define the
bridgeness of a vertex $v_i$ as the distance of its membership
vector $\mathbf{u}_i = [u_{1i}, u_{2i}, \ldots, u_{ci}]$ from the reference vector
$[\frac{1}{c}, \frac{1}{c}, \ldots, \frac{1}{c}]$ in the Euclidean vector norm, inverted
and normalized to the interval $[0, 1]$ as follows:

\begin{equation}
b_i = 1 - \sqrt{\frac{c}{c-1} \sum_{j=1}^{c} \left( u_{ji} - \frac{1}{c} \right)^2} \tag{12}
\end{equation}
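A short sketch of the bridgeness score follows, assuming the normalization $b_i = 1 - \sqrt{\frac{c}{c-1} \sum_j (u_{ji} - 1/c)^2}$, which maps a crisp membership vector to 0 and a perfectly even one to 1. The function name is an illustrative choice, and at least two communities are required.

```python
import math


def bridgeness(memberships):
    """Bridgeness of one vertex; `memberships` holds u_1i, ..., u_ci (c >= 2)."""
    c = len(memberships)
    sq_dist = sum((x - 1.0 / c) ** 2 for x in memberships)
    return 1.0 - math.sqrt(c / (c - 1.0) * sq_dist)


print(bridgeness([1.0, 0.0, 0.0]))  # crisp membership
print(bridgeness([0.5, 0.5]))       # evenly shared vertex
print(bridgeness([0.8, 0.2]))       # mostly in one community
```

Up to floating-point rounding, the three calls give approximately 0, 1 and 0.4. Multiplying the score by the degree of the vertex gives the degree-corrected variant.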



Note that $b_i$ attains its theoretical maximum when $v_i$
belongs to all of the communities with exactly the same
membership degree. In this case, it is possible that
$v_i$ is an outlier in the graph (a vertex
belonging to none of the communities) rather than
a bridge. To distinguish outliers from real bridges, one
should also look at the centrality measures of the node:
high centrality supports the assumption that the vertex
is effectively a bridge, because despite its central role in
the network, the algorithm was not able to assign it to
a single community. Low centrality may mean that the
algorithm strove to make the vertex dissimilar from
almost all other vertices, therefore it made it belong to all
the communities. The simplest measure that incorporates
centrality and bridgeness into a single number
is the product of the degree and the
bridgeness of the node, and will be called degree-corrected
bridgeness from now on. Other centrality measures (e.g.,
betweenness centrality, closeness centrality or eigenvector
centrality) can also be used. More sophisticated
centrality measures take into account that several networks
contain vertices that have a crucial role but a relatively
low degree (e.g., metabolic networks).



We also suggest plotting a chosen centrality measure
versus the bridgeness score for each vertex to visually aid
the selection of bridge vertices and outliers. An example
of this kind of plot will be shown in Section IV.
Bridgeness can either be used in benchmarks to assess
how sensitive the algorithm is to structural overlaps, or
in the analysis of real data to gain information about the
roles of the vertices in the network. Vertices with high
centrality and bridgeness scores close to zero are likely
to be in the cores of the communities, while bridgeness
scores close to one with a high centrality suggest vertices
standing in a bridgelike position between communities.
In this sense, subtracting the bridgeness score from 1
and multiplying it by an appropriate centrality measure
results in a measure of the centrality of the vertex with
respect to its own communities in the network, similarly
to the measure introduced in [21]. Benchmark results and
the application of bridgeness in data analysis are presented
in Section IV.



III. PARAMETRIZATION OF THE 
ALGORITHM 

At first glance, it may seem difficult to select the ap- 
propriate value for each parameter of the algorithm de- 
scribed in the previous section. However, most of these 
parameters have reasonable default values that can be 
used in most cases. The only exception is the number 
of clusters c, for which we will describe a simple process 
to identify its most suitable value. In this section, we 
explain the key ideas one should consider when choosing 
the appropriate values for the parameters. 



A. Choosing the number of communities 

The first and most important parameter of the method
is $c$, defining the number of communities the algorithm
tries to discover in the network. This parameter is the
keystone of most community detection algorithms, and
determining $c$ in a self-consistent way without human
intervention is definitely a complicated problem. Spectral
methods rely on the largest eigenvalues of the adjacency
matrix $\mathbf{A}_G$ or the smallest eigenvalues of the Laplacian
matrix $\mathbf{L}_G = \mathbf{D}_G - \mathbf{A}_G$ (where $\mathbf{D}_G$ is a diagonal matrix
with diagonal elements $k_i$, the degrees of the vertices)
to define the number of communities, but this is usually
done by visual inspection, and since the eigenspectrum
of most networks found in real applications resembles a
straight line instead of a step function, choosing $c$ is not
free of subjective elements. For instance, the number of
eigenvalues of the Laplacian matrix of a graph that are
close to zero is often used as the value of $c$, but this only
replaces the value of $c$ with another parameter: a threshold
level that decides which eigenvalues are considered to
be close to zero. The threshold is then chosen manually.

In order to get rid of the human intervention needed to
choose $c$ based on the eigenvalues, we propose a different,
divisive approach which also spares some computation in
the early stage of the algorithm. Initially, we compute
a fuzzy bisection of the graph by setting $c = 2$. After
that, whenever the optimization gets stuck in a local
minimum, we add another degree of freedom to the system
by increasing $c$ and continue with the optimization
from the last local minimum until it converges again.
We keep on increasing the number of communities until
we find that the newly introduced community does
not improve the overall community structure of the
network (after the algorithm has settled down again in a
minimum). The community structure is assessed by the
fuzzification of the modularity function. The modularity
defines how good a community
structure is by evaluating the difference between the
observed intra-community edge density and the expected
one based on a random graph model conditioned on the
degree sequence of the network. In a random graph with
exactly the same degree sequence as the original graph,
the probability of the existence of an edge between
vertices $i$ and $j$ is $k_i k_j / 2m$, where $k_i$ is the degree of vertex
$i$ and $m$ is the total number of edges in the network. The
original, "crisp" modularity of a network with vertex $i$
belonging to community $c(i)$ is then defined as:



\begin{equation}
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta_{c(i),c(j)} \tag{13}
\end{equation}

where $\delta_{c(i),c(j)}$ is 1 if vertices $i$ and $j$ belong to the same
community ($c(i) = c(j)$), and 0 otherwise. Since the
community structure in our algorithm is not clear-cut, the
predicate "vertices $i$ and $j$ belong to the same
community" also has a fuzzy truth value between 0 and
1. When the membership degree $u_{ki}$ is considered the
probability of the event that vertex $i$ is in community $k$,
the probability of the event that vertex $i$ belongs to the
same community as vertex $j$ becomes the dot product of
their membership vectors, resulting in the already
introduced similarity measure $s_{ij}$, which can be used in place
of $\delta_{c(i),c(j)}$ to obtain a fuzzified variant of the modularity:

\begin{equation}
Q_f = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] s_{ij} \tag{14}
\end{equation}



Note that in the case of crisp communities (there exists
only one $k$ for every vertex $i$ such that $u_{ki} = 1$), the
fuzzified modularity $Q_f$ is exactly the same as the crisp
modularity $Q$. In order to determine the optimal number
of fuzzy communities, we iteratively increase $c$ and choose
the one which results in the highest fuzzified modularity
$Q_f$.
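The fuzzified modularity can be computed with a double loop over the vertex pairs. The sketch below assumes an undirected graph given as a dense 0/1 adjacency matrix and a membership matrix stored as $c$ rows of $N$ entries; the function name and the example graph (two triangles joined by one edge) are illustrative assumptions.

```python
def fuzzy_modularity(adj, u):
    """Q_f = 1/(2m) * sum_ij (A_ij - k_i * k_j / (2m)) * s_ij."""
    n, c = len(adj), len(u)
    deg = [sum(row) for row in adj]
    two_m = float(sum(deg))  # every undirected edge is counted twice
    q = 0.0
    for i in range(n):
        for j in range(n):
            s_ij = sum(u[k][i] * u[k][j] for k in range(c))
            q += (adj[i][j] - deg[i] * deg[j] / two_m) * s_ij
    return q / two_m


# Hypothetical graph: triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1
crisp = [[1, 1, 1, 0, 0, 0],
         [0, 0, 0, 1, 1, 1]]
even = [[0.5] * 6, [0.5] * 6]
print(fuzzy_modularity(adj, crisp))
print(fuzzy_modularity(adj, even))
```

For the crisp split the value coincides with the ordinary modularity of the two triangles, 5/14, while the completely even membership matrix scores zero (up to rounding), since it carries no community information.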



B. Parametrization of similarity and dissimilarity
constraints



Next, we discuss the appropriate choice of the
remaining parameters ($\mathbf{W}$ and $\tilde{\mathbf{S}}$). These parameters are not
crucial to the final result of the algorithm, but they
provide a way to inject additional a priori knowledge into
the algorithm. Note that the goal function (see Eq. (2))
is a weighted sum of the squared differences between the
desired and the calculated similarity values. The algorithm tries
to minimize the difference by fitting the membership
values in an appropriate way. Without any further a priori
knowledge, $\tilde{\mathbf{S}}$ is the adjacency matrix $\mathbf{A}_G$ and $\mathbf{W}$ is a
matrix containing only ones. This means that the dot
product of the membership vectors defines the
similarities, and the desired similarity is 1 for adjacent and 0
for nonadjacent vertices, stating that the endpoints of
the edges should be as similar as possible, while keeping
disconnected vertex pairs dissimilar. The latter requirement is
important: if we only specified that the endpoints
should be similar for connected vertex pairs, the
algorithm would converge to a state where every vertex
belongs to the same community.

Depending on the domain from which the network
being analyzed originates, there may be some additional
knowledge about the original mechanism that created the
network, or there may be some uncertainty in the data.
$\mathbf{W}$ can be used to fine-tune the algorithm by making use
of the domain-specific knowledge. The general purpose
of $w_{ij}$ is to emphasize the connections where the
calculated similarity should match the expected one and skip
the connections where it is hard or impossible to
specify an expected similarity. $w_{ij}$ can also be useful in the
analysis of weighted networks.

Consider a large friendship network as an example. In
a friendship network, a reasonable assumption is that
the existence of a connection between A and B predicts
some kind of similarity between them. However, a
missing connection between A and B does not necessarily
mean dissimilarity: it might happen that A and B did
not have a chance to meet and form a connection. To
account for this, one can assume that A is similar to its
direct neighbors and dissimilar to its second order
neighbors only (because they were likely to meet through their
common acquaintances). If necessary, this assumption
can be incorporated into Eq. (2) by setting the weight of
the connections of A beyond its second order neighbors
to zero. We call this modification the distance-based
relaxation of the model. For an illustration of the concept,
see Fig. 3.
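The weight matrix of the distance-based relaxation can be obtained with a breadth-first search truncated at $k$ steps. The following sketch assumes an adjacency-list representation and unit weights inside the $k$-step neighborhood; the function name and the treatment of the diagonal (weight 1, since a vertex is at distance 0 from itself) are assumptions made for this example.

```python
from collections import deque


def relaxation_weights(adj_list, k=2):
    """w_ij = 1 if j is within k steps of i, else 0 (distance-based relaxation)."""
    n = len(adj_list)
    w = [[0] * n for _ in range(n)]
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            if dist[v] == k:
                continue  # do not expand beyond k steps
            for nb in adj_list[v]:
                if nb not in dist:
                    dist[nb] = dist[v] + 1
                    queue.append(nb)
        for j in dist:
            w[src][j] = 1
    return w


# Hypothetical path graph 0 - 1 - 2 - 3 - 4
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
w = relaxation_weights(path, k=2)
for row in w:
    print(row)
```

With $k = 2$, vertex 0 of the path keeps a similarity specification only for vertices 0, 1 and 2; the pairs (0, 3) and (0, 4) receive zero weight and therefore do not contribute to the goal function.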

The proper choice of $w_{ij}$ also allows us to analyze the
community structure of networks with incomplete data.
An example of this kind of network is
a graph model of the visuo-tactile cortex of the macaque
monkey, built from the neural connections of the
brain areas already documented in the literature.
However, there are actually two kinds of missing edges in this
network: the absence of an edge between brain areas A
and B can either mean that the specific connection was
tested for experimentally and found to be nonexistent or
that the connection has never been sought for at all
(due to, e.g., methodological difficulties). Our model can
account for this difference by setting the weight of the
suspected connections to zero and checking the similarity
of the vertices involved after the analysis. We will
discuss this later in Section IV.

Figure 3: The idea of distance-based relaxation. Direct
neighbors of vertex $i$ (denoted by plus signs) are assumed to be
similar to vertex $i$ (shown in black): $\tilde{s}_{ij} = 1$, $w_{ij} > 0$.
Vertices at most $k$ steps from vertex $i$ that are not direct
neighbors (denoted by minus signs) are assumed to be
dissimilar: $\tilde{s}_{ij} = 0$, $w_{ij} > 0$. Vertices farther than $k$ steps
(denoted by dashed circles) have no similarity specification
with respect to vertex $i$: $w_{ij} = 0$. The figure illustrates the
case of $k = 2$.

We also note that several other similarity measures can
be used when one defines the expected similarity
matrix $\tilde{\mathbf{S}}$ as long as the similarities are normalized into
the range $[0, 1]$. Based only on the neighborhood of
vertices, one can use the cosine similarity [23] or the
Jaccard similarity index [24]. More sophisticated,
matrix-based methods have been studied in the papers of Jeh
and Widom, Blondel et al. and Leicht et al.
Some of these measures are not normalized to the range
$[0, 1]$, but this can be done easily by using an
appropriate transformation (e.g., dividing the similarities by the
largest one found in the network).
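For the two neighborhood-based measures, a minimal sketch is given below. It assumes the usual set-based definitions (size of the common neighborhood divided by the size of the union for Jaccard, and by the geometric mean of the two degrees for cosine); the function names and the small example graph are illustrative.

```python
import math


def jaccard(adj_list, i, j):
    """Common neighbours of i and j divided by the size of the union."""
    a, b = set(adj_list[i]), set(adj_list[j])
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(adj_list, i, j):
    """Common neighbours divided by the geometric mean of the two degrees."""
    a, b = set(adj_list[i]), set(adj_list[j])
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical graph: triangle (0, 1, 2) with a pendant vertex 3 attached to 2.
graph = [[1, 2], [0, 2], [0, 1, 3], [2]]
print(jaccard(graph, 0, 1), cosine(graph, 0, 1))
print(jaccard(graph, 0, 3), cosine(graph, 0, 3))
```

Both measures already fall into $[0, 1]$, so they can be used as entries of the expected similarity matrix without further rescaling.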



IV. BENCHMARKS AND APPLICATIONS 

Generally, the community structure of a network is not
uniquely defined. Several partitions might exist that
approximate the underlying structure equally well,
especially if the network exhibits an overlapping or
hierarchical community structure. Overlapping
communities have been shown to be present in many networks
ranging from co-authorship networks to protein interactions.
We expect our algorithm not only to discover this overlapping
structure but also to exactly quantify the membership
degree of each vertex in all of its communities.

Unless stated otherwise, we parametrized our
algorithm as follows: $w_{ij} = 1$ for all $i, j$ and the desired
similarity was 1 if vertices $i$ and $j$ were connected or
$i$ was equal to $j$, and 0 otherwise. The automatic selection
of the bridges was achieved by the standardization of the
bridgeness scores: a vertex was considered a bridge if
its bridgeness score was at least one standard deviation
higher than the mean bridgeness score of the vertices of
the network.
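The selection rule can be sketched as follows. The bridgeness score is assumed to take the form $1 - \sqrt{\frac{c}{c-1} \sum_j (u_{ji} - 1/c)^2}$, and the population standard deviation (`statistics.pstdev`) is used; whether the population or the sample standard deviation is meant is not specified above, so this choice, like the function names and the example memberships, is an assumption.

```python
import math
import statistics


def bridgeness(memberships):
    c = len(memberships)
    sq_dist = sum((x - 1.0 / c) ** 2 for x in memberships)
    return 1.0 - math.sqrt(c / (c - 1.0) * sq_dist)


def select_bridges(u):
    """Return (bridge indices, scores) for a c x N membership matrix."""
    c, n = len(u), len(u[0])
    scores = [bridgeness([u[k][i] for k in range(c)]) for i in range(n)]
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)  # population standard deviation (assumed)
    if std == 0.0:
        return [], scores            # identical scores: nothing stands out
    return [i for i, b in enumerate(scores) if b >= mean + std], scores


# Hypothetical memberships: vertex 3 is shared equally, the others are not.
u = [[0.9, 0.9, 0.9, 0.5, 0.1, 0.1, 0.1],
     [0.1, 0.1, 0.1, 0.5, 0.9, 0.9, 0.9]]
bridges, scores = select_bridges(u)
print(bridges)
print([round(b, 3) for b in scores])
```

In this example only vertex 3 exceeds the threshold of one standard deviation above the mean. Because the rule is relative, it always depends on the score distribution of the whole network rather than on a fixed cut-off.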



A. Benchmarks on computer-generated graphs 

We tested our method on several computer-generated networks
with nonoverlapping as well as overlapping community
structure. Nonoverlapping community structures were generated
on graphs with 1024 vertices grouped into four communities,
each containing 256 vertices. Each vertex had an average of
k_in = 24 links to other vertices in the same community and
an additional k_out = 8 links to vertices from different
communities. The generated graph had 16,384 edges and a
density of 0.031. Overlapping communities were introduced by
grouping the vertices into two communities and declaring 128
vertices in both communities as bridge vertices. Regular
vertices kept their connectional patterns, having 24 links on
average to other vertices in their community and 8 links to
the other community. Bridge vertices had 6 links to other
vertices in their community, 12 links to other bridge
vertices in their community, 6 links to bridge vertices of
the other community and 8 links to regular vertices of the
other community. The edge count and the density were equal to
those of the nonoverlapping case. Fig. 4 shows a possible
adjacency matrix for both the nonoverlapping and the
overlapping case.
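A minimal sketch of a generator for the nonoverlapping benchmark follows. The text does not state the exact generation procedure, so this sketch assumes independent edges with probabilities chosen to give the stated average degrees; the edge count is then 16,384 only in expectation. The name `planted_partition` and its parameters are hypothetical.

```python
import numpy as np

def planted_partition(n_communities=4, size=256, k_in=24, k_out=8, seed=None):
    """Random graph with expected k_in intra- and k_out inter-community links."""
    rng = np.random.default_rng(seed)
    n = n_communities * size
    labels = np.repeat(np.arange(n_communities), size)
    p_in = k_in / (size - 1)      # each vertex has size - 1 possible partners inside
    p_out = k_out / (n - size)    # and n - size possible partners outside
    same = labels[:, None] == labels[None, :]
    prob = np.where(same, p_in, p_out)
    # Sample the upper triangle only, then mirror it to get a symmetric matrix.
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    adjacency = (upper | upper.T).astype(np.int8)
    return adjacency, labels
```

Setting `k_in=8, k_out=24` makes the two probabilities nearly equal (8/255 versus 24/768), which matches the remark that the graph then loses its community structure.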

In order to compare a fuzzy partition with an expected hard
partition, we introduced the notion of dominant community.
The dominant community of a vertex is the community to which
it belongs to the greatest extent. Formally, community i is
the dominant community of vertex j if u_ij = max_k u_kj for
1 ≤ k ≤ c. Out of 1000 graphs with nonoverlapping community
structures, the algorithm classified all vertices correctly
in 97.4% of the test cases after converting the achieved
fuzzy partition to its hard counterpart using the dominant
communities. It was also able to infer the actual number of
communities automatically in all cases using the fuzzified
modularity. To further study the distribution of
intra-community and inter-community edges, we varied the
number of inter-community edges (k_out) from 0 to 24 while
keeping k_in + k_out constant. When k_out reaches 24, the
graph practically becomes an Erdős-Rényi random graph devoid
of any community structure, since the connectional
probability between any two of the predefined communities is
equal. Fig. 5(a) shows the results of the benchmark. The
quality of the calculated community structure was assessed
by the normalized mutual information as described in [28].
Interestingly, the performance of the algorithm degrades
suddenly when the number of inter-community links exceeds 16.
This is the point where on average there are more links
between the communities
than inside them.

Generated graphs with overlapping community structure were
used to test the sensitivity of the algorithm to vertices
standing between communities. The model we used declared 128
vertices out of 512 in both communities as bridge candidates,
and clearly distinguished them by their different
connectional patterns: bridge candidates tended to connect to
each other with a higher probability than to the regular
vertices in their communities, even if they originally
belonged to different communities, creating an overlap
between the two communities. Because of the randomized nature
of this model, not all bridge candidates became real bridges
between the communities, but they had a significantly higher
chance of becoming one. We used the bridgeness value
introduced in Section III B to assess the quality of the
results. We expected that bridge candidate vertices exhibit a
different bridgeness score distribution than the regular
vertices in the same graph. We also required that vertices
identified as bridges by our algorithm should be among those
that had been declared bridge candidates before test graph
generation. We generated 1000 random graphs using this graph
model and plotted the distribution of the bridgeness scores
in Fig. 5(b). The different nature of the two distributions
was supported by a Kolmogorov-Smirnov test (p-value less than
2.2 × 10^-16). Regular vertices usually had lower bridgeness
scores than the bridge candidates, and we found that 92.8% of
the identified bridges (based on their standardized
bridgeness scores) were among bridge candidates, confirming
that the algorithm is sensitive to the existence of overlaps
between communities.

Figure 4: Adjacency matrices of graphs with nonoverlapping
(top) and overlapping (bottom) community structure we used
for benchmarking our algorithm.

Figure 5: Panel (a) shows the performance of the algorithm
for a graph with nonoverlapping community structure.
Inter-cluster link count (k_out) was varied while keeping the
average degree (k_in + k_out) constant. The quality of the
obtained result was measured using the normalized mutual
information of the found and real communities [28]. Panel (b)
shows the frequencies of bridgeness values in a graph with
overlapping community structure. Thin line shows frequencies
for regular nodes, thick line shows frequencies for bridge
nodes. The bin width of the histogram was set to 0.01 (100
bins).



B. Social and collaboration networks



To evaluate the performance of our method on a real dataset,
we used the social network of the academic staff of a given
Faculty of a UK university consisting of three separate
schools. The network structure was constructed from tie
strength measured with a questionnaire, where the items
formed a reliable scale. Reliability was assessed by
Cronbach's α (2^ . Our questionnaire achieved a Cronbach's α
of 0.91, suggesting high internal consistency and
reliability. The questionnaire was completed by every member
of the academic staff. In this study, we used the personal
friendship network, ignoring the directionality and the
weight of the edges. A fuzzy community detection for three
communities was performed on the graph. To show the results
in grayscale, we decided to draw three individual figures
(Fig. 6(a), 6(b) and 6(c)), showing the values of the
membership functions for community 1, 2 and 3, respectively,
using different shades of gray as fill colors for the
vertices. Fig. 6(d) shows the degree-corrected bridgeness
values for each vertex. Other centrality measures resulted in
the same corrected bridgeness scores after normalization.

Figure 6: The fuzzy communities of the UK university faculty
dataset. Panel (a), (b) and (c): vertices colored according
to the membership functions of community 1, 2 and 3,
respectively. Darker shades represent larger membership
values. Panel (d): vertices colored according to the
degree-corrected bridgeness scores. Darker shades represent
higher bridgeness.



This dataset also contained explicit information regarding
the expected community structure, since for every node we
knew which school in the Faculty it belonged to. We
defuzzified the results using the dominant communities for
every vertex. The defuzzification revealed that all crisp
communities consisted almost exclusively of the members of a
single school inside the Faculty. 75 out of 81 vertices were
classified correctly, 4 were misclassified (and all of them
had a bridgeness value greater than 0.7), and there were 2
vertices for which no expectation was given because of lack
of information in the questionnaire. It is also noteworthy
that the maximal fuzzy modularity (Q_f = 0.2826) was reached
at c = 6, suggesting further subdivisions of the schools,
although the improvement of the modularity compared to the
case of c = 3 (Q_f = 0.2541) was not significant.
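The defuzzification step and the comparison with known labels can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: U is taken to be a c × n membership matrix (communities in rows), unknown expected labels are encoded as -1, ties are resolved in favor of the first community, and, because community indices are arbitrary, agreement is counted under the best relabeling. Both function names are hypothetical.

```python
from itertools import permutations

import numpy as np

def dominant_communities(U):
    """Index of the largest membership value in each column of a c x n matrix."""
    return np.asarray(U, dtype=float).argmax(axis=0)

def best_match_count(found, expected, c):
    """Largest number of agreeing vertices over all relabelings of c communities.

    Vertices whose expected label is -1 (unknown) are ignored.
    """
    found = np.asarray(found)
    expected = np.asarray(expected)
    known = expected >= 0
    best = 0
    for perm in permutations(range(c)):
        relabeled = np.array(perm)[found]   # rename community k to perm[k]
        best = max(best, int(np.sum(relabeled[known] == expected[known])))
    return best
```

Trying all permutations is only practical for a small number of communities such as c = 3; for larger c an assignment solver would be the more sensible choice.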



Figure 7: Comparison of the uncorrected (left) and degree- 
corrected bridgeness scores (right) in the UK university 
dataset. Vertices are colored according to their respective 
bridgeness scores. Darker shades represent higher bridgeness 
scores. Note how the uncorrected bridgeness score correlates 
with the centrality of the vertices in their respective commu- 
nity. 



Degree-corrected bridgeness scores for c = 3 (Fig. 6(d)) are
particularly interesting. Highly scored individuals belong to
all three communities at the same time to some extent,
maintaining connections to all of them. On the other hand,
vertices with low degree-corrected bridgeness scores can be
thought of as the cores of the communities. We also notice
that the peripheries of the communities belong almost equally
to all of the communities (note the similar grey shades in
Fig. 6(a), 6(b) and 6(c) for these vertices), but the
degree-corrected bridgeness scores suppress this effect
because of the low degree of these vertices. The uncorrected
and the degree-corrected scores are compared side by side in
Fig. 7. We also point out that the uncorrected bridgeness
scores can be used as a measure of the centrality of a given
vertex with respect to its own dominant community by
subtracting them from 1.

The next dataset we studied was the co-authorship network of
scientists working on network theory and experiment, as
published in [21]. The network consists of 1589 scientists
and 2742 weighted, undirected connections. Edge weights are
derived from the number of joint publications: if authors A
and B share a paper that has n authors in total, this paper
contributes 1/(n − 1) to the total weight of the edge. We
extracted the giant component of the network, consisting of
379 scientists and 914 connections, and let our algorithm
determine the number of communities using the fuzzified
modularity again. The optimum (Q_f = 0.7082) was found with
c = 30 communities. The value of c was confirmed by visual
inspection of the eigenvalues of the Laplacian matrix.
Without giving names, we observe that vertices with the
highest centralities according to our measure were similar to
the ones chosen by the community centrality measure
introduced in [21] and mostly represented senior researchers
of the field of network science. Bridges were detected by
standardizing the bridgeness values and considering vertices
with a z-score higher than 1 as bridges. The 31 bridge
vertices were mostly postdoctoral researchers who
collaborated with more than one senior researcher of the
field.
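The edge weighting of the co-authorship network can be illustrated with a short sketch. It assumes that a paper with n authors contributes 1/(n − 1) to every pair of its authors, which is the usual weighting for this kind of collaboration dataset; the function name `coauthorship_weights` and the input format (one author list per paper) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def coauthorship_weights(papers):
    """Map each unordered author pair to its accumulated edge weight.

    `papers` is an iterable of author lists. A paper with n distinct
    authors adds 1 / (n - 1) to every pair of its authors (assumed rule).
    """
    weights = defaultdict(float)
    for authors in papers:
        authors = sorted(set(authors))     # ignore duplicates, fix pair order
        n = len(authors)
        if n < 2:
            continue                       # single-author papers add no edge
        for a, b in combinations(authors, 2):
            weights[(a, b)] += 1.0 / (n - 1)
    return dict(weights)
```

For example, a two-author paper and a three-author paper shared by A and B give the edge (A, B) a weight of 1 + 1/2.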



C. Cortical networks and the case of incomplete 
data 

To test how our method performs on graphs with missing data
(vertex pairs for which no information regarding their
connectedness was known), we used the graph model of the
macaque monkey's visuo-tactile cortex as published in [23].
The graph consists of 45 vertices representing brain areas,
and 463 directed connections representing neuronal pathways
between the areas. A missing edge between two vertices does
not necessarily mean that there is no connection between
them: some connections have been explicitly tested for and
found to be absent, others have simply not been tested for
(but are generally thought to be absent), and there are 13
vertex pairs in total where neuroanatomists strongly suspect
that a connection exists. The graph itself consists of two
distinct and mostly nonoverlapping communities corresponding
to the visual and the somatosensory cortex. Other,
anatomically meaningful subdivisions of the cortices (like
the dorsal and the ventral stream in the visual cortex) are
known as well. We also note that 11 out of the 13 suspected
connections are heteromodal in the sense that they go between
the visual and the somatosensory cortex.

To account for the uncertainty and the directedness of 
the edges in the graph, we specified Wij as follows: Wij 
was if there was a nonreciprocal connection between 
area i and j (area i connected to j, but no pathway was 
found in the reverse direction) or if the connection was 
one of the suspected ones, otherwise w_ij was 1. The optimal
fuzzy modularity (0.2766) was reached at c = 4. We examined
the results for c = 2 and c = 4. The case of c = 2 classified
the nodes correctly: all of the somatosensory areas were
associated with the somatosensory cortex and most of the
visual areas were associated with the visual cortex, except a
few areas with a surprisingly high bridgeness (over 0.85).
The vertex with the highest bridgeness (0.99) was area 46, a
part of the dorsolateral prefrontal cortex, which has no
functions related to low-level sensory information
processing. Area 46 is rather a higher-level (supramodal)
area that plays a role in sustaining attention and working
memory; being a bridge between the visual and the
somatosensory cortex, it integrates visual, tactile and other
information necessary for these cognitive functions. Other
relevant bridges found with c = 4 were area VIP (for which
the literature has already suggested a split into two areas,
VIPm and VIPp, which establish stronger connections with
visual or sensorimotor areas, respectively [sS]), LIP, V4 and
7a. VIP and LIP are involved with hand and eye coordination,
respectively, and both of these functions require combined
information from visual and tactile signals. Area 7a
integrates visual, tactile and proprioceptive signals. Area
V4 was defined originally as the human color center [sT],
[33], while it was also suggested that a separate ensemble of
V4 neurons successfully encodes complex shapes based on the
curvature of the shape boundaries [33]. The functional
heterogeneity is in accordance with the subdivision of V4
into different regions as suggested by Bartels et al. i3^].
We conclude that the bridges we found are in concordance with
the assumed higher-level roles of these areas. Fuzzy
community detection for c = 4 was also able to separate the
dorsal and the ventral stream of the visual cortex; only
areas 7a and VIP were misclassified, but they retained their
bridge-like properties, as did area 46. The degree-corrected
bridgeness values for c = 4 are shown in Fig. 8. Plotting the
uncorrected bridgeness values versus a chosen centrality
measure (in our case, the vertex degree), as shown in Fig. 9,
was found to be a useful visual aid for separating bridge
vertices and outliers.

Figure 8: The cortical network dataset [23]. Rectangular
vertices are visual areas, circular vertices are
somatosensory areas. Vertices are colored according to their
degree-corrected bridgeness values for c = 4. Detected
bridges are highlighted with white text color.

Figure 9: The degree-bridgeness plot of the vertices of the
cortical network dataset for c = 4. Crosses denote regular
vertices and triangles denote bridges. Bridges and the single
significant outlier (area VOT) have also been marked with the
name of the corresponding area. The remaining names were
omitted for the sake of clarity. VOT is a biologically
relevant example of a vertex with low degree and high
bridgeness.

To approximate the probability of the suspected connections,
we calculated the pairwise similarities of the vertices
involved and considered the similarity as the probability of
the existence of a connection. This is based on the idea that
one can consider the membership value u_ij as the probability
of vertex j belonging to community i. In this sense, the
similarity of vertices i and j is the probability of the
event that they are in the same community, and according to
our prior assumption that similarity implies connectivity, we
can think of higher similarity values as precursors of
existing connections. Without going into further details and
possible neuroanatomical implications, we concluded that all
supposed connections of area LIP are less likely than the
supposed connections of VIP, and among the possible unknown
connections of VIP, the connections with areas 4 and 6 are
the most probable.
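The similarity-based ranking of suspected connections can be sketched as below. It uses the simplest form of the similarity, S = U^T U, with U a c × n membership matrix whose columns sum to 1, so that s_ij is the probability that two independently assigned vertices fall into the same community. The function names and the pair format are assumptions of this sketch, not part of the original method description.

```python
import numpy as np

def similarity_matrix(U):
    """S = U^T U for a c x n membership matrix U (communities in rows)."""
    U = np.asarray(U, dtype=float)
    return U.T @ U

def rank_suspected(U, pairs):
    """Sort vertex pairs by similarity, most probable connection first."""
    S = similarity_matrix(U)
    scored = [((i, j), float(S[i, j])) for i, j in pairs]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Given a fitted U and the list of suspected vertex pairs, the head of the returned list holds the connections that the fuzzy partition considers most plausible.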



D. Comparison with other overlapping community 
detection algorithms 

In order to compare our method with earlier attempts at
tackling the problem of overlapping communities, we examined
the CPM algorithm of Palla et al. [1], the spectral method of
Capocci et al. [4] and the fuzzy method of Zhang et al. @ .
We tested all three methods on the example graph shown in
Fig. 1(a) and on the macaque monkey dataset introduced in
Section IV C. For the CPM algorithm, we used the original
implementation published by the authors at
http://www.cfinder.org. The algorithm of Zhang et al. has a
weight exponent m controlling the degree of fuzzification,
but since the authors gave no suggested value for this
parameter, we used m = 2, which is the most typical choice in
other known applications of the fuzzy c-means algorithm Q.

The proper community structure of the example graph was
detected by all algorithms we considered (including ours),
although the spectral method of Capocci et al. had to be
tested on a different example graph with three cliques (each
of size 4) and a single connector node, because in the case
of only two communities, the only eigenvector that carries
useful information is the first nontrivial one, rendering
correlation calculations meaningless. Moreover, the global
community structure became evident only after proper
rearrangement of the community closeness matrix provided by
the algorithm. The bridge-like property of the connector
vertex was inferred from the zero community closeness values
to all other vertices. The method of Zhang et al. and our
method produced the expected partition matrix, with all the
vertices except vertex 5 classified strictly into one
community or the other, while vertex 5 belonged to both at
the same time with a membership degree of 0.5. The method of
Palla et al. identified vertex 5 as an outlier vertex, but
after adding more edges to it, it became an overlap between
the communities.

The community structure of the cortical graph seemed to be a
harder problem for the algorithms. The method of Palla et al.
failed to discover the subdivision of the two main
communities; only the visual and the somatosensory cortex
were discovered when we used a clique size of 5. Larger
clique sizes resulted in the discovery of the cores of the
two communities, but we were not able to recognize the
subdivision of the dorsal and the ventral stream in the
visual cortex. However, the algorithm identified three
overlaps (V4, PITv and TF) for a clique size of 5 and two
other overlaps (LIP and VIP) for a clique size of 6. Three
out of these five overlaps were identified by our algorithm
as well. The community closeness matrix calculated by the
method of Capocci et al. was harder to interpret, but
vertices V4 and 46 clearly turned out to be bridges with zero
community closeness to many other vertices. The method of
Zhang et al. was highly sensitive to the exact value of
parameter m, classifying 40% of the vertices as bridges for
m = 2. (Since the method provides a membership matrix similar
to ours, we used the standardized bridgeness measure with a
z-score threshold of 1.) Lowering the weight exponent to
m = 1.3 identified vertices LIP, 7a and Ri as bridges.

We found that the results of our algorithm with respect 
to community structure discovery and bridge identifica- 
tion do not contradict the results of existing methods, 
and all the bridges found by our algorithm were classified 
as bridges by at least one different method. The method 
of Capocci et al complements our algorithm especially 
well, since it discovers local communities around a given 
vertex using the community closeness degrees while our 
method provides useful insights into the global structure 
of the network being analyzed, also indicating the pres- 
ence of bridge vertices. 



V. CONCLUSION 

In this paper, we presented a fuzzy extension of clas- 
sical community detection algorithms based on the as- 
sumption that communities of complex networks are 
formed by vertices with graded commitments towards 
at least one community. Accordingly, every vertex is 
allowed to belong to multiple communities with differ- 
ent membership degrees, represented by a single real 
value u_ki ∈ [0, 1] for each vertex i and community k. The
matrix U = [u_ki] encodes the membership values in a compact
form and allows us to define the similarities of the vertices
as S = U^T U in its simplest form. The similarities are then
optimized using gradient-based constrained optimization
methods in order to make connected vertices similar and
disconnected vertices dissimilar. Based on the results of the
fuzzy community detection, we introduced a novel concept
called bridgeness, which can be used to measure to what
extent a given vertex is shared between the communities.
Vertices with
high bridgeness values were shown to be important in 
various complex networks, including (but not limited to) 
social networks, scientific collaboration networks and cor- 
tical networks. A transformed variant of bridgeness can 
be used as a centrality measure with respect to the dom- 
inant communities of a vertex. 

We emphasize that this algorithm is expected to be 
highly useful in the analysis of relatively small datasets 
(up to the magnitude of a thousand vertices). The reason 
is that the algorithm assumes that every vertex has the
possibility to connect to all other vertices, and if they do
not connect, they do that because they are of no use to 
each other. In very large networks, this assumption is not 
always realistic. However, the distance-based relaxation 
introduced in Section IIIII can still be used in these cases 
to account for the upper bound imposed on the distance 
of the potentially interacting vertices. 
[35] Bridges described in this paper are not to be confused
with the concept of cut edges, which are sometimes also
referred to as bridges in classical graph theory.
Articulation points (vertices whose removal disconnects the
remaining subgraph) bear more similarity to the concept of
bridges described in this paper, but not all bridge vertices
are articulation points. From the structural perspective, the
concept of bridge and bridgeness may be considered a
generalization of the notion of articulation point, suitably
tailored to the problem of community detection.

[36] The matrix form of this problem bears some similarity to
the Cholesky decomposition. For positive weights, D_G(U) is
zero if and only if S = U^T U. This would be easy to solve if
U were an n × n matrix (meaning that the number of
communities c is equal to the number of vertices n), and S
were symmetric and positive-definite. Since none of these
conditions hold, all that we can do is to minimize the
difference between S and U^T U by finding an appropriate U.

[37] Local maxima are easy to avoid by choosing an α^(t) that
always decreases the value of the goal function in the next
step. Saddle points and not too deep local minima can be
avoided by randomly mutating the acquired solution and
checking whether the iteration converges back to the
original, unmutated solution.

[38] Other vector norms are also conceivable with different 
normalization factors to make the result span over the 
interval [0, 1]. 



